-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve typing #57
base: main
Are you sure you want to change the base?
Improve typing #57
Conversation
Thank you very much for the PR. I think this is a good addition. Personally I use static typing only very selectively so my personal preference would be to use a non-strict mode just so we can benefit from some type checks without "cluttering" the whole code base.
Personally I'm not paying too much attention to CPython's EOL dates. To me the RHEL life cycle is way more important. For me, Python 2.7 just went EOL as RHEL 7 support ended a week ago. Python 3.6 is still the default for RHEL 8 and is supported until 2029. Disclaimer: These are just my personal preferences. If there are other contributors willing to invest their own time, I'm open to drop older versions/adapt my personal preferences. @caseyjhol let me know if you like to keep support for Python 3.6-3.8.
Could you explain a bit more why we have to drop dotmap? That is something that will cause some issues for our integration with mjml so I would really like to keep it. I would really like to keep attribute access for returned instances. Please note I also got push access for dotmap so we can improve the code there if necessary. |
I wouldn't say strict mode necessarily "clutters" the code base (based on my experience from previous projects I've added typing support to), but I'll for sure start with adhering to non-strict mode, and we can see where to go from there then. Why I normally prefer strict mode is that every function must be typed, which itself adds a bit more confidence in the implementations of functions.
Ah, gotcha. I guess we can try to keep supporting Python 3.6-3.7, and just not run mypy for these versions. I believe I'm not super familiar with setuptools (I'm mostly using Poetry nowadays), so I might require some assistance with making sure the project files are configured correctly.
I dropped it at least for the time being, in order to better grasp what is accessed from where (i.e. avoiding a bunch of
Could you expand a bit on this? On the current version of the branch I replaced the return type of the main
Nice! The main issue here was that dotmap is entirely untyped. Initially I started investigating how much work it would be to add type hints to dotmap as well, but ended up foregoing that plan for the time being, since:
|
As a side note, I've also experimented with rewriting the parser from scratch using only Python built-ins:
Right now both of these functions in this package mostly mirror the reference implementation, but could be greatly simplified if manual parsing and tree building were avoided (i.e. reducing custom string interpolation which may or may not be correct). If you don't mind, I'd replace the existing implementations for these functions as well. |
just a quick reply in between meetings:
If Python 3.6 can be supported easily, it would be nice but don't spend much time on that. As I mentioned for my own use cases a baseline of 3.9 is good enough now that RHEL 7 is EOL.
No, dotmap 1.3.25 started using f-strings so they dropped Python 2.7 compatibility. (That was before I had push access.) Don't worry too much about old versions here.
Better/simpler parsing would be welcome for sure. There are some edge cases which bit me in the past before we used beautifulsoup but I'm absolutely open to simpler implementations. |
Alright: I'll work on this as if 3.8 is the minimum version (it's no overhead over 3.9), and we can circle back to 3.6 later, if deemed necessary.
You're right, didn't notice that, and I simply looked at the 2.7-esque function names like
It'd be interesting to hear about some of these edge cases, if you happen to remember any? I think the potential vulnerabilities listed here don't really apply for valid MJML, and I had to do some escaping to pass the tests for the official parser anyways. |
I think we had two issues with the initial parser: How do you plan to proceed with this PR? Do you want to add more changes or are you waiting on more detailed feedback? Also we can consider splitting up the PR and merging the improvements incrementally. |
This seems to be a non-issue with my implementation, or at least I added for the the following MJML, and it appears to parse correctly: <mjml>
<mj-body>
<mj-section>
<mj-button href="/">a b</mj-button>
<mj-text> x©y </mj-text>
</mj-section>
</mj-body>
</mjml> ...and this is something I've been thinking about, but currently all whitespace is unaffected, so e.g. <mj-text>
this is a string
</mj-text> ...would be parsed to a node with the inner content: "\n this is a string\n" This is something that the HTML rendering engine handles in its own way anyways, and I'm not entirely convinced that it would be worth it to implement an entire "whitespace minimiser" anyways (since even this doesn't apply to all elements, such as
I've been experimenting with a "from-scratch" minimal implementation in order to figure out the best approach to achieving the best type coverage, so it'll take at least to the end of this week for me to have some MVP of that which could eventually lead to a fully typed version of this package.
Definitely: I could open a smaller PR with stubs for the public interface within the next few days, at least to resolve #50, and after that gradually continue on this PR to improve the internal type coverage, if that sounds reasonable to you? |
@xoudini Nice work here! Thanks for the contribution. I'm okay with dropping support of older versions of Python. What are the advantages to moving away from BeautifulSoup? If there are performance benefits, I'd be curious to see if we could quantify them. My vote would be to continue using the well-maintained BeautifulSoup as a dependency instead of adding another thing for us to have to maintain ourselves. If we're going to attempt to reinvent the wheel, I want to make sure there's enough advantage to doing so. |
Great! Then I'll treat Python 3.8 as the minimum baseline for now.
Mainly just one less dependency to worry about, but certainly it may have its place. My (potentially naïve) view on the topic is that HTML (and even more so MJML, apart from whatever is inside Performance considerations haven't been on the top of my list thus far, but I could certainly do some benchmarking later.
Agreed. Based on previous discussion, I'll keep the parser as-is for now, and we can re-open that can of worms in a separate PR (exclusively opened for that purpose). |
The goal of this branch is to add complete typing to the package, for two main reasons:
I've started to do some exploration within the code base, with the initial goal to pass mypy checks in non-strict mode, and the ultimate goal to satisfy mypy in strict mode. There's a GitHub Actions workflow for tracking progress in that regard.
Some downsides:
dict
s, at least temporarily, in order to avoid a bunch of# type: ignore
comments. It may be possible to re-introduce it (or another similar package) once the type check passes.Let me know if you're interested in me pursuing this goal!